TAMU, 3 Sep. 2015
11:15-12:00
load('MissingData.RData')
table(apply(missdat$founders, 2, function(x) sum(is.na(x))))
##
##  0  1  2  3  4  5
## 15 29 31 19  5  2
nmiss <- apply(missdat$finals, 1, function(x) sum(is.na(x))/length(x))
hist(nmiss, breaks=20, col="tomato")
impdat <- mpimpute(missdat)
## [1] "No chromosomes specified, will default to all"
## Using map groupings for groups. Remove map object if you want to regroup.
## --Read the following data:
##   200 individuals
##   101 markers
##   2 phenotypes
table(apply(impdat$founders, 2, function(x) sum(!is.na(x))))
##
##   8
## 101
nmissi <- apply(impdat$finals, 1, function(x) sum(is.na(x))/length(x))
sum(nmissi>0)
## [1] 0
sum(is.na(impdat$founders))
## [1] 0
sum(impdat$founders!=dat$founders)
## [1] 0
sum(is.na(impdat$finals))/prod(dim(impdat$finals))
## [1] 0
## Proportion of originally missing genotypes that were imputed incorrectly
sum(impdat$finals!=dat$finals, na.rm=TRUE)/sum(is.na(missdat$finals))
## [1] 0.08916084
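The error-rate calculation above can be tried without the workshop data. This is a minimal base-R sketch on made-up matrices: `truth`, `observed`, and `imputed` are hypothetical stand-ins for `dat$finals`, `missdat$finals`, and `impdat$finals`, and the "imputation" here is faked rather than produced by `mpimpute`.

```r
# Toy illustration only: these matrices are made up, not from MissingData.RData.
set.seed(1)
truth <- matrix(sample(0:1, 200, replace = TRUE), nrow = 20)  # 20 lines x 10 markers

# Mask 30 distinct genotype calls at random.
observed <- truth
masked <- sample(length(truth), 30)
observed[masked] <- NA

# Stand-in "imputation": start from the truth, then deliberately
# flip 3 of the masked calls so that they are wrong.
imputed <- truth
imputed[masked[1:3]] <- 1 - imputed[masked[1:3]]

# Proportion of missing calls per line (same idea as nmiss above).
prop_missing <- rowMeans(is.na(observed))

# Error rate = wrongly imputed calls / calls that were originally missing.
err_rate <- sum(imputed != truth) / sum(is.na(observed))
err_rate
## [1] 0.1
```

The denominator is the number of masked calls, not the size of the matrix, so the rate describes only the genotypes that actually had to be imputed.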
library(spclust)
load('SimulatedSP.RData')
selLines <- spclust(dat, nlines=20, method="average")
## [1] "No chromosomes specified, will default to all"
## Using map groupings for groups. Remove map object if you want to regroup.
## --Read the following data:
##   200 individuals
##   255 markers
##   2 phenotypes
## No required lines input; will only select a single-stage sample
plot(selLines, type=2)
ped4 <- sim.mpped(4, 3, 200) # MAGIC4RIL
ped8 <- sim.mpped(8, 30, 200) # MAGIC8RIL, 30 funnels
ped8ai2 <- sim.mpped(8, 1, 200, iripgen=2) # MAGIC8AI2RIL
ped26nam <- generateNAMpedigree(26, 100) # NAMRIL
load('datfinalPart2.RData')
## Based just on highest probability allele
nrecEst <- lapply(mppEst$estfnd, function(x)
  apply(x, 1, function(y) return(sum(diff(y[!is.na(y)])!=0))))
mean(rowSums(do.call("cbind", nrecEst)))
## [1] 12.44
## Errors in the map can cause additional recombination events
load('Part2.RData')
nrecTrue <- lapply(mppTrue$estfnd, function(x)
  apply(x, 1, function(y) return(sum(diff(y[!is.na(y)])!=0))))
mean(rowSums(do.call("cbind", nrecTrue)))
## [1] 11.954
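The counting rule used for `nrecEst` and `nrecTrue` is "number of changes in the most probable founder along a chromosome, skipping missing calls". A minimal sketch on a made-up matrix (`est` and `count_switches` are hypothetical names, not mpMap objects):

```r
# Toy illustration: rows are lines, columns are markers in map order on one
# chromosome, entries are the most probable founder (NA = no call).
est <- rbind(
  c(1, 1, 1, 2, 2, 2),   # one switch (1 -> 2)
  c(3, NA, 3, 3, 1, 4),  # two switches (3 -> 1, 1 -> 4); the NA is skipped
  c(2, 2, 2, 2, 2, 2)    # no switches
)

# Same expression as in the slides, wrapped in a named function.
count_switches <- function(y) sum(diff(y[!is.na(y)]) != 0)

nrec_toy <- apply(est, 1, count_switches)
nrec_toy
## [1] 1 2 0

# A single wrong founder call in the middle of a block adds two switches.
count_switches(c(1, 1, 2, 1, 1))
## [1] 2
```

The last call shows why a simple count is inflated by isolated errors: one miscalled marker is counted as two recombination events.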
## Based on forward-backward algorithm with penalty
source('nrec.R')
mean(nrec(mppEst, penalty=0)$totrec)
## [1] 13.797
mean(nrec(mppEst, penalty=1)$totrec)
## [1] 12.055
mean(nrec(mppEst, penalty=2)$totrec)
## [1] 10.941
nr <- nrec(mppEst, penalty=1)
boxplot(do.call("cbind", nr$nrec), col="tomato")
mppEst$pheno$nrec <- nrec(mppEst, penalty=1)$totrec
mprec <- mpIM(object=mppEst, responsename="nrec", ncov=0)
## No QTL found - but possible in real data
load('datfinalPart2.RData')
plot(datfinal)
mpp <- mpprob(datfinal, program="qtl")
plot(mpp)
load('Part2.RData')
mc <- mapcomp(dat, datfinal)
summary(mc)
## Number of markers in map1 is 505
## Number of markers in map2 is 505
## Number of common markers is 505
## Number of duplicated markers in map1 is 0
## Number of duplicated markers in map2 is 0
## Number of markers with differing chromosomes between maps is 0
## Correlations between chromosomes are:
## 0.999142 0.991306 -0.999554 -0.9988986 0.9987963
plot(mc)
library(RCircos)
source('CircularIntx.R')
pmatrix <- matrix(runif(505*505, 0, 1), nrow=505)
plotCircIntx(pmatrix, threshold=1e-4, map=datfinal$map, file="CircEx.png")
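Because `pmatrix` is filled with uniform random numbers, any links drawn in the example plot are noise. A rough sketch of how many entries should fall below the threshold by chance (the variable names here are illustrative, and the seed is arbitrary):

```r
# Uniform "p-values" for every ordered pair of 505 markers, as in the example.
set.seed(42)
n <- 505
p <- matrix(runif(n * n, 0, 1), nrow = n)
threshold <- 1e-4

# Under the uniform distribution each entry falls below the threshold with
# probability 1e-4, so the expected count is n^2 * threshold.
expected_hits <- n * n * threshold
expected_hits
## [1] 25.5025

# Actual count for this particular random draw (varies with the seed).
observed_hits <- sum(p < threshold)
```

With real interaction p-values the matrix would normally be symmetric, so each marker pair would be counted once rather than twice.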
Very basic simulation
- Generate 10 datasets each from 4-parent and 8-parent designs with these parameters
- Same map: 5 chromosomes, 51 markers per chromosome, 100 cM, evenly spaced
- QTL
    + Chr 1, 47 cM, size=.5
    + Chr 2, 13 cM, size=.3
    + Chr 3, 81 cM, size=.1
- How often is each QTL detected?
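One way to answer the detection question is to tally, per true QTL, the fraction of simulated datasets in which some detected QTL lands nearby. The sketch below uses base R only. The `detected` list is made-up output standing in for real interval-mapping results, `is_detected` is a hypothetical helper, and the 10 cM window is an arbitrary assumption.

```r
# True QTL from the simulation settings.
true_qtl <- data.frame(chr = c(1, 2, 3), pos = c(47, 13, 81),
                       size = c(0.5, 0.3, 0.1))

# Made-up results: one data frame of detected QTL (chr, pos) per dataset.
detected <- list(
  data.frame(chr = c(1, 2), pos = c(46, 15)),
  data.frame(chr = 1, pos = 49),
  data.frame(chr = c(1, 3), pos = c(47, 60)),
  data.frame(chr = numeric(0), pos = numeric(0))  # nothing detected
)

# A true QTL counts as detected if any estimate on the same chromosome
# lies within 'window' cM of its position (assumed criterion).
is_detected <- function(found, truth, window = 10) {
  vapply(seq_len(nrow(truth)), function(i) {
    any(found$chr == truth$chr[i] & abs(found$pos - truth$pos[i]) <= window)
  }, logical(1))
}

# Rows = true QTL, columns = datasets.
hits <- vapply(detected, is_detected, logical(nrow(true_qtl)), truth = true_qtl)
detection_rate <- rowMeans(hits)
names(detection_rate) <- paste0("chr", true_qtl$chr)
detection_rate
## chr1 chr2 chr3 
## 0.75 0.25 0.00
```

In the actual exercise the `detected` list would be filled from the QTL found in each of the 10 simulated datasets per design, and the window can be tightened or widened to see how sensitive the detection rate is to that choice.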
For one of the 4-parent and 8-parent datasets generated for the simulation, what's the average number of recombinations?
Plot a histogram of the recombinations for each dataset.
Jin et al. 2004, Selective phenotyping for increased efficiency in genetic mapping studies. Genetics 168:2285-2293. doi: 10.1534/genetics.104.027524
Jannink 2005, Selective phenotyping to accurately map quantitative trait loci. Crop Science 45:901-908. doi: 10.2135/cropsci2004.0278
Hickey et al. 2014, AlphaMPSIM: flexible simulation of multi-parent crosses. Bioinformatics 30:2686-2688. doi: 10.1093/bioinformatics/btu206
Huang et al. 2013, Selecting subsets of genotyped experimental populations for phenotyping to maximize genetic diversity. TAG 126: 379-388. doi: 10.1007/s00122-012-1986-4
Huang et al. 2014, Efficient imputation of missing markers in low-coverage genotyping-by-sequencing data from multiparental crosses. Genetics 197:401-404. doi: 10.1534/genetics.113.158014
Kover et al. 2009, A multiparent advanced generation inter-cross to fine-map quantitative traits in Arabidopsis thaliana. PLoS Genetics doi: 10.1371/journal.pgen.1000551
Contact me: b.emma.huang@gmail.com
github.com/behuang/mpMap for updates